communication budget






SA-PEF: Step-Ahead Partial Error Feedback for Efficient Federated Learning

Redie, Dawit Kiros, Arablouei, Reza, Werner, Stefan

arXiv.org Machine Learning

Biased gradient compression with error feedback (EF) reduces communication in federated learning (FL), but under non-IID data, the residual error can decay slowly, causing gradient mismatch and stalled progress in the early rounds. We propose step-ahead partial error feedback (SA-PEF), which integrates step-ahead (SA) correction with partial error feedback (PEF). SA-PEF recovers EF when the step-ahead coefficient α = 0 and step-ahead EF (SAEF) when α = 1. For non-convex objectives and δ-contractive compressors, we establish a second-moment bound and a residual recursion that guarantee convergence to stationarity under heterogeneous data and partial client participation. To balance SAEF's rapid warm-up with EF's long-term stability, we select α near its theory-predicted optimum. Experiments across diverse architectures and datasets show that SA-PEF consistently reaches target accuracy faster than EF.

Modern large-scale machine learning increasingly relies on distributed computation, where both data and compute are spread across many devices. Federated learning (FL) enables model training in this setting without centralizing raw data, enhancing privacy and scalability under heterogeneous client distributions (McMahan et al., 2017; Kairouz et al., 2021). In each synchronous FL round, the server broadcasts the current global model to a subset of clients. These clients perform several steps of stochastic gradient descent (SGD) on their local data and return updates to the server, which aggregates them to form the next global iterate (Huang et al., 2022; Wang & Ji, 2022; Li et al., 2024). Although FL leverages rich distributed data, it faces two key challenges.
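The abstract does not spell out the SA-PEF update itself, so the following is a minimal sketch of the α = 0 case it subsumes: classic error feedback with a top-k (δ-contractive) compressor. The alpha argument and the lookahead term are purely illustrative placeholders for where a step-ahead correction would enter; the exact SA-PEF rule is given in the paper, not here.

import numpy as np

def top_k(v, k):
    # delta-contractive compressor: keep the k largest-magnitude entries
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def ef_round(grad, residual, k, lr, alpha=0.0):
    # One client-side round of error feedback (EF).
    # alpha = 0 is classic EF; alpha > 0 adds an illustrative step-ahead term
    # (an assumption for this sketch, not the paper's exact SA-PEF update).
    corrected = grad + residual          # fold the accumulated compression error back in
    message = top_k(corrected, k)        # what the client actually transmits
    residual = corrected - message       # error kept locally for the next round
    update = lr * (message + alpha * residual)
    return message, update, residual

With alpha = 0 the transmitted message and the applied update coincide (up to the learning rate), which is exactly the EF baseline the paper compares against.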


Prediction-space knowledge markets for communication-efficient federated learning on multimedia tasks

Du, Wenzhang

arXiv.org Artificial Intelligence

Federated learning (FL) enables collaborative training over distributed multimedia data but suffers acutely from statistical heterogeneity and communication constraints, especially when clients deploy large models. Classic parameter-averaging methods such as FedAvg transmit full model weights and can diverge under non-independent and identically distributed (non-IID) data. We propose KTA v2, a prediction-space knowledge trading market for FL. Each round, clients locally train on their private data, then share only logits on a small public reference set. The server constructs a client-client similarity graph in prediction space, combines it with reference-set accuracy to form per-client teacher ensembles, and sends back personalized soft targets for a second-stage distillation update. This two-stage procedure can be interpreted as approximate block-coordinate descent on a unified objective with prediction-space regularization. Experiments on FEMNIST, CIFAR-10, and AG News show that, under comparable or much lower communication budgets, KTA v2 consistently outperforms a local-only baseline and strong parameter-based methods (FedAvg, FedProx), and substantially improves over a FedMD-style global teacher. On CIFAR-10 with ResNet-18, KTA v2 reaches 57.7% test accuracy using approximately 1/1100 of FedAvg's communication, while on AG News it attains 89.3% accuracy with approximately 1/300 of FedAvg's traffic.
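A hedged sketch of the server-side step described in the abstract: build a prediction-space similarity graph from client logits on the public reference set, scale it by reference-set accuracy, and return personalized soft targets for the second-stage distillation. The cosine similarity, the multiplicative accuracy weighting, and names such as personalized_soft_targets are assumptions for illustration, not the paper's exact construction.

import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def personalized_soft_targets(client_logits, ref_accuracy):
    # client_logits: dict client_id -> (n_ref, n_classes) logits on the public reference set
    # ref_accuracy:  dict client_id -> accuracy of that client on the reference set
    ids = list(client_logits)
    probs = {c: softmax(client_logits[c]) for c in ids}
    flat = {c: probs[c].ravel() for c in ids}
    targets = {}
    for c in ids:
        weights = []
        for o in ids:
            # prediction-space similarity (cosine) scaled by reference-set accuracy
            sim = flat[c] @ flat[o] / (np.linalg.norm(flat[c]) * np.linalg.norm(flat[o]) + 1e-12)
            weights.append(sim * ref_accuracy[o])
        weights = np.asarray(weights)
        weights /= weights.sum()
        # personalized teacher ensemble for client c
        targets[c] = sum(w * probs[o] for w, o in zip(weights, ids))
    return targets

Each client would then run a second local update that distills toward targets[c] on the reference set, so only logits, never model weights, cross the network.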


Reviews: Distributed k-means and k-median clustering on general communication topologies

Neural Information Processing Systems

"NIPS 2013 Neural Information Processing Systems December 5 - 10, Lake Tahoe, Nevada, USA",,, "Paper ID:","1011" "Title:","Distributed k-means and k-median clustering on general communication topologies" Reviews First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The paper provides provably efficient algorithms for performing k-means and k-median clustering in the distributed setting. The main focus of the paper is minimizing communication cost in the distributed network. Although, i am not very much aware of the literature, the paper seems to provide a very novel idea of distributed coresets that leads to clustering algorithms which provably improves the state of the art communication complexity significantly. Existing approaches only use the idea of approximating coresets by taking the union of local coresets.


A Separate Quantization and Privatization Is Strictly Suboptimal

Neural Information Processing Systems

First, let us recap the subset selection (SS) scheme proposed by [51]. In the achievability part of Theorem 2.1, our proposed scheme SQKR randomly and independently samples coordinates; we summarize it in the following corollary (Corollary B.2). The achievability parts of Corollary B.1 and Corollary B.2 follow directly from the analysis of SQKR. Note that the red line in Figure 3 can be achieved by RHR. A scheme is consistent if it has vanishing estimation error as n → ∞, and O(min(⌈ε⌉ log d, d)) bits of communication suffice to achieve this. Similarly, the estimation error of private-coin RHR is characterized in Corollary B.4 (Private-coin RHR). We implement our mean estimation scheme, Subsampled and Quantized Kashin's Response (SQKR), and construct the tight frame using the random partial Fourier matrices of [36]; it can be shown that the tight frame based on U has Kashin's level K = O(1). Figure 4: compared to the optimal ε-LDP scheme [13], SQKR achieves similar performance with significantly smaller communication budgets.
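The fragments above describe SQKR only at a high level, so here is a hedged, self-contained sketch of the subsample-quantize-privatize pipeline for distributed mean estimation. Assumptions made for brevity: inputs lie in [-1, 1]^d, the Kashin tight-frame step is skipped, and per-bit randomized response with budget ε/k stands in for the paper's 2^k-ary mechanism; all function names are illustrative.

import numpy as np

rng = np.random.default_rng(0)

def client_report(x, k, eps):
    # Each client sends k coordinate indices and k privatized bits.
    d = len(x)
    idx = rng.choice(d, size=k, replace=False)       # subsample k coordinates
    p_one = (x[idx] + 1.0) / 2.0                     # stochastic 1-bit quantization
    bits = (rng.random(k) < p_one).astype(float)
    q = np.exp(eps / k) / (1.0 + np.exp(eps / k))    # per-bit randomized response (assumption)
    flip = rng.random(k) >= q
    bits[flip] = 1.0 - bits[flip]
    return idx, bits

def server_estimate(reports, d, k, eps):
    # Debias the randomized response and the quantization, then rescale for subsampling.
    q = np.exp(eps / k) / (1.0 + np.exp(eps / k))
    est = np.zeros(d)
    for idx, bits in reports:
        debiased = (bits - (1.0 - q)) / (2.0 * q - 1.0)
        est[idx] += (d / k) * (2.0 * debiased - 1.0)
    return est / len(reports)

# Tiny usage example: estimate the mean of n vectors in [-1, 1]^d.
n, d, k, eps = 2000, 32, 4, 2.0
data = rng.uniform(-1.0, 1.0, size=(n, d))
reports = [client_report(x, k, eps) for x in data]
err = np.linalg.norm(server_estimate(reports, d, k, eps) - data.mean(axis=0))
print(f"l2 error of the estimate: {err:.3f}")

Each report costs only k coordinate indices plus k bits, which is the kind of drastically reduced communication budget the comparison with the uncompressed ε-LDP scheme refers to.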